asymptotic direction
Country:
- Asia > China > Liaoning Province > Dalian (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > Canada (0.04)
Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
The Implicit Bias of AdaGrad on Separable Data
In recent years, implicit regularization induced by various optimization algorithms has been shown to play a crucial role in the generalization ability of deep neural networks [Salakhutdinov and Srebro, 2015, Neyshabur et al., 2015, Keskar et al., 2016, Neyshabur et al., 2017, Zhang et al., 2017]. For example, in underdetermined problems where the number of parameters is larger than the number of training examples, many global optima fail to exhibit good generalization properties. A specific optimization algorithm (such as gradient descent), however, does converge to a particular optimum that generalizes well, even though no explicit regularization is enforced when training the model.
1906.03559
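The abstract's point about underdetermined problems can be illustrated with a small numerical sketch. This is not the paper's setting (which concerns AdaGrad on separable data). It uses underdetermined least squares, where plain gradient descent started from zero is known to converge to the minimum-norm interpolating solution. The problem sizes, random data, step size, and the helper name gradient_descent are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy problem: 5 training examples, 20 parameters (underdetermined).
rng = np.random.default_rng(0)
n_samples, n_params = 5, 20
X = rng.standard_normal((n_samples, n_params))
y = rng.standard_normal(n_samples)


def gradient_descent(X, y, w0, lr, steps):
    """Plain gradient descent on the loss 0.5 * ||X w - y||^2."""
    w = w0.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y)
    return w


# Step size 1 / sigma_max(X)^2 keeps the iteration stable.
lr = 1.0 / np.linalg.norm(X, 2) ** 2
w_gd = gradient_descent(X, y, np.zeros(n_params), lr, steps=20000)

# Minimum-norm global optimum, computed via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

# Another global optimum: add a component from the null space of X.
null_component = rng.standard_normal(n_params)
null_component -= np.linalg.pinv(X) @ (X @ null_component)
w_other = w_min_norm + null_component

print("GD fits the data:          ", np.allclose(X @ w_gd, y, atol=1e-6))
print("other optimum fits the data:", np.allclose(X @ w_other, y, atol=1e-6))
print("GD matches min-norm optimum:", np.allclose(w_gd, w_min_norm, atol=1e-6))
```

Both w_gd and w_other reach zero training loss, but gradient descent selects the minimum-norm one without any explicit penalty term. That selection is the kind of implicit bias the abstract refers to.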